AITopics | von Mises-Fisher distribution

Dependent nonparametric trees for dynamic hierarchical clustering

Kumar Avinava Dubey, Qirong Ho, Sinead A. Williamson, Eric P. Xing

Neural Information Processing Systems, Feb-9-2025, 04:42:59 GMT

Hierarchical clustering methods offer an intuitive and powerful way to model a wide variety of data sets. However, the assumption of a fixed hierarchy is often overly restrictive when working with data generated over a period of time: We expect both the structure of our hierarchy, and the parameters of the clusters, to evolve with time. In this paper, we present a distribution over collections of time-dependent, infinite-dimensional trees that can be used to model evolving hierarchies, and present an efficient and scalable algorithm for performing approximate inference in such a model. We demonstrate the efficacy of our model and inference algorithm on both synthetic data and real-world document corpora.

artificial intelligence, machine learning, node, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.71)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement

Zhu, Yiwen, Liu, Jinyi, Wei, Wenya, Fu, Qianyi, Hu, Yujing, Fang, Zhou, An, Bo, Hao, Jianye, Lv, Tangjie, Fan, Changjie

arXiv.org Artificial Intelligence, May-14-2024

Reinforcement Learning (RL) is a widely employed technique in decision-making problems, encompassing two fundamental operations -- policy evaluation and policy improvement. Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency. However, when using multiple critics, the actor in the policy improvement process can obtain different gradients. Previous studies have combined these gradients without considering their disagreements. Therefore, optimizing the policy improvement process is crucial to enhance learning efficiency. This study focuses on investigating the impact of gradient disagreements caused by ensemble critics on policy improvement. We introduce the concept of uncertainty of gradient directions as a means to measure the disagreement among gradients utilized in the policy improvement process. Through measuring the disagreement among gradients, we find that transitions with lower uncertainty of gradient directions are more reliable in the policy improvement process. Building on this analysis, we propose a method called von Mises-Fisher Experience Resampling (vMFER), which optimizes the policy improvement process by resampling transitions and assigning higher confidence to transitions with lower uncertainty of gradient directions. Our experiments demonstrate that vMFER significantly outperforms the benchmark and is particularly well-suited for ensemble structures in RL.
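The idea of scoring transitions by how well the ensemble's gradient directions agree can be sketched as follows. This is a minimal illustration, not the vMFER algorithm from the paper: it assumes the mean resultant length of the normalized gradients as the agreement score and samples transitions in proportion to that score. The function names and the proportional sampling rule are hypothetical.

```python
import math
import random


def mean_resultant_length(gradients):
    """Length of the average unit vector; 1.0 means all directions agree."""
    dim = len(gradients[0])
    total = [0.0] * dim
    for g in gradients:
        norm = math.sqrt(sum(x * x for x in g))
        if norm == 0.0:
            continue  # a zero gradient carries no direction
        for i, x in enumerate(g):
            total[i] += x / norm
    return math.sqrt(sum(t * t for t in total)) / len(gradients)


def resampling_probabilities(per_transition_gradients, eps=1e-8):
    """Assumed rule: probability proportional to directional agreement."""
    scores = [mean_resultant_length(g) + eps for g in per_transition_gradients]
    z = sum(scores)
    return [s / z for s in scores]


def resample(transitions, probabilities, k, seed=0):
    """Draw k transitions with replacement using the given probabilities."""
    rng = random.Random(seed)
    return rng.choices(transitions, weights=probabilities, k=k)
```

For example, a transition whose critics produce gradients `[[1.0, 0.0], [2.0, 0.0]]` gets an agreement score of 1, while one with `[[1.0, 0.0], [-1.0, 0.0]]` gets a score of 0 and is almost never drawn.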

algorithm, gradient direction, transition, (13 more...)

arXiv.org Artificial Intelligence

2405.08638

Country:

North America > United States > Indiana > Hamilton County > Fishers (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

A solution for the mean parametrization of the von Mises-Fisher distribution

Nonnenmacher, Marcel, Sahani, Maneesh

arXiv.org Machine Learning, Apr-10-2024

The von Mises-Fisher distribution, as an exponential family, can be expressed in terms of either its natural or its mean parameters. Unfortunately, the normalization function for the distribution in terms of its mean parameters is not available in closed form, limiting the practicality of the mean parametrization and complicating maximum-likelihood estimation more generally. We derive a second-order ordinary differential equation, the solution to which yields the mean-parameter normalizer along with its first two derivatives, as well as the variance function of the family. We also provide closed-form approximations to the solution of the differential equation. This allows rapid evaluation of both densities and natural parameters in terms of mean parameters. We show applications to topic modeling with mixtures of von Mises-Fisher distributions using Bregman Clustering.
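To make the difficulty concrete: for a von Mises-Fisher distribution on the unit sphere in d dimensions, the mean resultant length is r = A_d(κ) = I_{d/2}(κ) / I_{d/2-1}(κ), and recovering the concentration κ from r has no closed form. The sketch below is not the differential-equation method of the paper. It assumes a plain numerical route instead: the Bessel ratio is evaluated by a truncated continued fraction, inverted by bisection, and compared with the widely used approximation of Banerjee et al. (2005). All function names are hypothetical.

```python
import math


def mean_resultant_length(kappa, d):
    """A_d(kappa) = I_{d/2}(kappa) / I_{d/2-1}(kappa), for kappa > 0.

    Uses the recurrence r_nu = 1 / (2 (nu + 1) / kappa + r_{nu+1}),
    evaluated backwards from a truncation depth where r is set to 0.
    """
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    depth = int(kappa) + 60  # assumed truncation depth
    ratio = 0.0
    for j in range(depth, -1, -1):
        ratio = 1.0 / (2.0 * (d / 2.0 + j) / kappa + ratio)
    return ratio


def kappa_from_mean(r, d, iterations=200):
    """Invert A_d by bisection; A_d increases monotonically from 0 to 1."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    lo, hi = 1e-12, 1.0
    while mean_resultant_length(hi, d) < r:
        hi *= 2.0  # grow the bracket until it contains the root
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_resultant_length(mid, d) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def kappa_banerjee(r, d):
    """Closed-form approximation r (d - r^2) / (1 - r^2)."""
    return r * (d - r * r) / (1.0 - r * r)
```

For d = 3 the ratio reduces to coth(κ) - 1/κ, which gives a convenient check of the continued fraction.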

approximation, mean parametrization, von Mises-Fisher distribution, (13 more...)

arXiv.org Machine Learning

2404.07358

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Dependent nonparametric trees for dynamic hierarchical clustering

Eric P. Xing

Neural Information Processing Systems, Mar-13-2024, 08:47:19 GMT

algorithm, node, tree-structured stick-breaking process, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.71)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Leveraging Optimal Transport via Projections on Subspaces for Machine Learning Applications

Bonet, Clément

arXiv.org Artificial Intelligence, Nov-23-2023

Optimal Transport has received much attention in Machine Learning as it allows probability distributions to be compared by exploiting the geometry of the underlying space. However, in its original formulation, solving this problem suffers from a significant computational burden. Thus, a meaningful line of work consists of proposing alternatives that reduce this burden while retaining its properties. In this thesis, we focus on alternatives which use projections on subspaces. The main such alternative is the Sliced-Wasserstein distance, which we first propose to extend to Riemannian manifolds in order to use it in Machine Learning applications for which such spaces have been shown to be beneficial in recent years. We also study sliced distances between positive measures in the so-called unbalanced OT problem. Returning to the original Euclidean Sliced-Wasserstein distance between probability measures, we study the dynamics of gradient flows when endowing the space with this distance in place of the usual Wasserstein distance. Then, we investigate the use of the Busemann function, a generalization of the inner product in metric spaces, in the space of probability measures. Finally, we extend the subspace detour approach to incomparable spaces using the Gromov-Wasserstein distance.
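The projection idea behind the Euclidean Sliced-Wasserstein distance can be sketched with a Monte Carlo estimate: project both samples onto random directions, sort the projections, and average the resulting one-dimensional squared distances. This is a minimal sketch of the standard Euclidean case only, assuming two samples of equal size with uniform weights; the function names, the number of projections, and the seed are illustrative choices, and none of the Riemannian or unbalanced extensions from the thesis are covered.

```python
import math
import random


def random_direction(dim, rng):
    """Uniform direction on the unit sphere via a normalized Gaussian draw."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def sliced_wasserstein_sq(xs, ys, n_projections=200, seed=0):
    """Monte Carlo estimate of the squared Sliced-Wasserstein-2 distance."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        # In one dimension, optimal transport matches sorted samples.
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
        py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(xs)
    return total / n_projections
```

Sorting costs O(n log n) per projection, which is the source of the computational saving compared with solving the full transport problem.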

data mining, large language model, machine learning, (24 more...)

arXiv.org Artificial Intelligence

2311.13883

Country:

North America > United States (1.00)
Europe > France (0.27)
Europe > United Kingdom (0.27)
(7 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.67)
Research Report > Experimental Study (0.45)
Overview > Innovation (0.45)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Health & Medicine > Health Care Technology (0.92)
Health & Medicine > Diagnostic Medicine > Imaging (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
(3 more...)

Towards Robust Feature Learning with t-vFM Similarity for Continual Learning

Gao, Bilan, Kim, YoungBin

arXiv.org Artificial Intelligence, Jun-4-2023

Continual learning has been developed using standard supervised contrastive loss from the perspective of feature learning. Due to data imbalance during training, learning good representations remains challenging. In this work, we suggest using a different similarity metric instead of cosine similarity in supervised contrastive loss in order to learn more robust representations. We validate our method on the image classification dataset Seq-CIFAR-10, and the results outperform recent continual learning baselines. Continual learning is the research direction that mainly tackles the problem of catastrophic forgetting while a model continually learns from new data.
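The abstract does not spell out the alternative similarity. Judging from the title, it is presumably the t-vMF similarity, which, as we understand the definition from Kobayashi (2021), replaces cos θ with (1 + cos θ) / (1 + κ(1 - cos θ)) - 1. The sketch below assumes that form. The function names and the default κ are illustrative, and how the similarity enters the supervised contrastive loss is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def t_vmf_similarity(u, v, kappa=16.0):
    """Assumed t-vMF form; kappa = 0 recovers plain cosine similarity."""
    c = cosine_similarity(u, v)
    return (1.0 + c) / (1.0 + kappa * (1.0 - c)) - 1.0
```

Under this form, identical directions still score 1 and opposite directions -1, but larger κ pushes intermediate angles toward -1, so only tightly aligned features count as similar.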

artificial intelligence, learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2306.02335

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > United States (0.04)
Asia > China > Heilongjiang Province > Daqing (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A General Framework for Visualizing Embedding Spaces of Neural Survival Analysis Models Based on Angular Information

Chen, George H.

arXiv.org Artificial Intelligence, May-11-2023

We propose a general framework for visualizing any intermediate embedding representation used by any neural survival analysis model. Our framework is based on so-called anchor directions in an embedding space. We show how to estimate these anchor directions using clustering or, alternatively, using user-supplied "concepts" defined by collections of raw inputs (e.g., feature vectors all from female patients could encode the concept "female"). For tabular data, we present visualization strategies that reveal how anchor directions relate to raw clinical features and to survival time distributions. We then show how these visualization ideas extend to handling raw inputs that are images. Our framework is built on looking at angles between vectors in an embedding space, where there could be "information loss" by ignoring magnitude information. We show how this loss results in a "clumping" artifact that appears in our visualizations, and how to reduce this information loss in practice.

anchor direction, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.06862

Country:

Europe > Netherlands > South Holland > Rotterdam (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Mixture of von Mises-Fisher distribution with sparse prototypes

Rossi, Fabrice, Barbaro, Florian

arXiv.org Artificial Intelligence, Dec-30-2022

Mixtures of von Mises-Fisher distributions can be used to cluster data on the unit hypersphere. This is particularly adapted for high-dimensional directional data such as texts. We propose in this article to estimate a von Mises mixture using an $\ell_1$ penalized likelihood. This leads to sparse prototypes that improve clustering interpretability. We introduce an expectation-maximisation (EM) algorithm for this estimation and explore the trade-off between the sparsity term and the likelihood term with a path-following algorithm. The model's behaviour is studied on simulated data, and we show the advantages of the approach on a real data benchmark. We also introduce a new data set on financial reports and exhibit the benefits of our method for exploratory analysis.
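One way to see why an ℓ1 penalty produces sparse prototypes: maximizing μ·r - λ‖μ‖₁ over unit vectors μ is solved by soft-thresholding r coordinate-wise and renormalizing, so small coordinates become exactly zero. The sketch below shows only that step. Treating it as the prototype update of the mixture is an assumption for illustration; the article's EM and path-following algorithms may differ, and the function names are hypothetical.

```python
import math


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become zero."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def sparse_direction(r, lam):
    """Unit vector maximizing mu . r - lam * ||mu||_1 (assumed update)."""
    shrunk = [soft_threshold(x, lam) for x in r]
    norm = math.sqrt(sum(s * s for s in shrunk))
    if norm == 0.0:
        raise ValueError("penalty too large: every coordinate was shrunk to zero")
    return [s / norm for s in shrunk]
```

With `r = [0.9, 0.05, -0.4, 0.02]` and `lam = 0.1`, the second and fourth coordinates of the returned direction are exactly zero, which is the kind of sparsity that makes a text-cluster prototype easier to read.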

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.neucom.2022.05.118

2212.14591

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Banking & Finance > Trading (0.92)
Banking & Finance > Financial Services (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Score Matching for Truncated Density Estimation on a Manifold

Williams, Daniel J., Liu, Song

arXiv.org Machine Learning, Jun-29-2022

When observations are truncated, we are limited to an incomplete picture of our dataset. Recent methods deal with truncated density estimation problems by turning to score matching, where access to the intractable normalising constant is not required. We present a novel extension to truncated score matching for a Riemannian manifold. Applications are presented for the von Mises-Fisher and Kent distributions on a two-dimensional sphere in $\mathbb{R}^3$, as well as a real-world application of extreme storm observations in the USA. In simulated data experiments, our score matching estimator is able to approximate the true parameter values with a low estimation error and shows improvements over a maximum likelihood estimator.

artificial intelligence, machine learning, truncated density estimation, (11 more...)

arXiv.org Machine Learning

2206.14668

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Bristol (0.04)
Atlantic Ocean (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

Brümmer, Niko, Swart, Albert, Mošner, Ladislav, Silnova, Anna, Plchot, Oldřich, Stafylakis, Themos, Burget, Lukáš

arXiv.org Machine Learning, Mar-28-2022

In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring backends are commonly used, namely cosine scoring or PLDA. Both have advantages and disadvantages, depending on the context. Cosine scoring follows naturally from the spherical geometry, but for PLDA the blessing is mixed: length normalization Gaussianizes the between-speaker distribution, but violates the assumption of a speaker-independent within-speaker distribution. We propose PSDA, an analogue to PLDA that uses von Mises-Fisher distributions on the hypersphere for both within-class and between-class distributions. We show how the self-conjugacy of this distribution gives closed-form likelihood-ratio scores, making it a drop-in replacement for PLDA at scoring time. All kinds of trials can be scored, including single-enroll and multi-enroll verification, as well as more complex likelihood ratios that could be used in clustering and diarization. Learning is done via an EM algorithm with closed-form updates. We explain the model and present some first experiments.

artificial intelligence, machine learning, plda, (14 more...)

arXiv.org Machine Learning

2203.14893

Country:

Europe > Czechia > South Moravian Region > Brno (0.05)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Greece > Attica > Athens (0.04)
(3 more...)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
